A parallel adaptive P3M code with hierarchical particle reordering

Authors

  • Robert J. Thacker
  • Hugh M. P. Couchman
Abstract

We discuss the design and implementation of HYDRA OMP, a parallel implementation of the Smoothed Particle Hydrodynamics–Adaptive P3M (SPH-AP3M) code HYDRA. The code is designed primarily for conducting cosmological hydrodynamic simulations and is written in Fortran77+OpenMP. A number of optimizations for RISC processors and SMP-NUMA architectures have been implemented. The most important is hierarchical reordering of particles within chaining cells, which greatly improves data locality, thereby removing the cache misses typically associated with linked lists. Parallel scaling is good, with a minimum parallel scaling of 73% achieved on 32 nodes for a variety of modern SMP architectures. We give performance data in terms of the number of particle updates per second, which is a more useful performance metric than raw MFlops. A basic version of the code will be made available to the community in the near future.
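
The idea of reordering particles within chaining cells can be illustrated with a minimal Python sketch. This is not the HYDRA code, which is Fortran77+OpenMP, and it shows only a single-level reorder rather than the hierarchical scheme the abstract refers to. The function names `cell_index` and `reorder_by_cell`, the unit-cube domain, and the uniform `ncell`-per-side grid are all assumptions made for illustration. A counting sort places the particles of each cell in one contiguous slice, so a cell's members can be visited sequentially instead of by following a linked list.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def cell_index(pos: Vec3, ncell: int) -> int:
    """Flattened chaining-cell index for a position in the unit cube [0, 1).

    Hypothetical helper: assumes a uniform grid with ncell cells per side.
    """
    ix, iy, iz = (min(int(c * ncell), ncell - 1) for c in pos)
    return (ix * ncell + iy) * ncell + iz


def reorder_by_cell(positions: Sequence[Vec3], ncell: int) -> Tuple[List[int], List[int]]:
    """Counting sort of particles by chaining cell.

    Returns (order, cell_start): order[k] is the original index of the
    particle stored at slot k after reordering, and the particles of cell c
    occupy slots cell_start[c] .. cell_start[c + 1] - 1.
    """
    ncells = ncell ** 3
    cells = [cell_index(p, ncell) for p in positions]

    # Histogram of cell occupancy, shifted by one slot.
    cell_start = [0] * (ncells + 1)
    for c in cells:
        cell_start[c + 1] += 1

    # Prefix sum turns counts into the starting offset of each cell.
    for c in range(ncells):
        cell_start[c + 1] += cell_start[c]

    # Scatter particle indices into their cell's contiguous slice.
    cursor = cell_start[:-1]
    order = [0] * len(positions)
    for i, c in enumerate(cells):
        order[cursor[c]] = i
        cursor[c] += 1
    return order, cell_start


if __name__ == "__main__":
    pts = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.95, 0.8, 0.9), (0.2, 0.3, 0.1)]
    order, cell_start = reorder_by_cell(pts, ncell=2)
    sorted_pts = [pts[i] for i in order]  # particles now contiguous per cell
    print(order, cell_start)
```

Once the particle arrays are permuted by `order`, neighbouring particles in space tend to be neighbours in memory, which is the data-locality effect the abstract attributes to reordering.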

Similar resources

Tree–Particle–Mesh: an adaptive, efficient, and parallel code for collisionless cosmological simulation

An improved implementation of an N-body code for simulating collisionless cosmological dynamics is presented. TPM (Tree–Particle–Mesh) combines the PM method on large scales with a tree code to handle particle-particle interactions at small separations. After the global PM forces are calculated, spatially distinct regions above a given density contrast are located; the tree code calculates the ...

GOTPM: A Parallel Hybrid Particle-Mesh Treecode

We describe a parallel, cosmological N-body code based on a hybrid scheme using the particle-mesh (PM) and Barnes-Hut (BH) oct-tree algorithm. We call the algorithm GOTPM for Grid-of-Oct-Trees-Particle-Mesh. The code is parallelized using the Message Passing Interface (MPI) library and is optimized to run on Beowulf clusters as well as symmetric multi-processors. The gravitational potential is ...

A Load Balancing Package on Distributed Memory Systems and its Application

We present a tool, Bisect, for balanced decomposition of spatial domains. In addition to applying a nested bisection algorithm to determine the boundaries of each subdomain, Bisect replicates a user-specified zone along the boundaries of the subdomain in order to minimize future interactions between subdomains. Results of running the tool on the Cray T3D system using both shared memory operation...

Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform

There are different variants of the Particle Swarm Optimization (PSO) algorithm, such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...

A Multi-Scale Electromagnetic Particle Code with Adaptive Mesh Refinement and Its Parallelization

To investigate multi-scale phenomena in space plasma, including plasma kinetic effects, we started to develop a new electromagnetic Particle-In-Cell (PIC) code with the Adaptive Mesh Refinement (AMR) technique. In AMR simulation, spatial grid size and time step intervals are defined according to the hierarchy levels, where high and low levels correspond to the fine and coarse grid systems, respectiv...

Journal:
  • Computer Physics Communications

Volume 174

Publication date: 2006